Comparative analysis and evaluation of wild and cultivated Radix Fici Simplicissimae using an UHPLC-Q-Orbitrap mass spectrometry-based metabolomics approach

Radix Fici Simplicissimae (RFS) is widely studied and is in demand for its value in medicines and food products, with increasing scientific focus on its cultivation and breeding. We used ultra-high-performance liquid chromatography quadrupole-Orbitrap mass spectrometry-based metabolomics to elucidate the similarities and differences in the phytochemical compositions of wild Radix Fici Simplicissimae (WRFS) and cultivated Radix Fici Simplicissimae (CRFS). Untargeted metabolomic analysis was combined with multivariate statistical analysis and heat maps to identify the differences. Eighty-one compounds were identified from the WRFS and CRFS samples. Principal component analysis and orthogonal partial least squares discriminant analysis indicated that mass spectrometry could effectively distinguish WRFS from CRFS. Among the identified compounds, 17 potential biomarkers with high contents could distinguish between the two varieties, including seven phenylpropanoids, three flavonoids, one flavonol, one alkaloid, one glycoside, and four organic acids. Notably, psoralen, apigenin, and bergapten, essential metabolites that play a substantial pharmacological role in RFS, were upregulated in WRFS. WRFS and CRFS are both rich in phytochemicals and are similar in terms of the compounds they contain. These findings highlight the effects of different growth environments and drug varieties on secondary metabolite compositions and provide support for the targeted breeding of improved CRFS varieties.


Plant materials
Qingyuan City is in the mountainous area of northern Guangdong Province and is a central residential area of the Yao nationality in China. In this study, 29 batches of RFS were harvested. Seventeen batches of WRFS samples were collected from Qingyuan City (Guangdong Province, China), and twelve CRFS samples were collected from hospitals and pharmacies (Table 1). Professor Yuan Xiaohong of Guangdong Provincial Hospital of Traditional Chinese Medicine identified all the medicinal materials. Among them, WRFS were provided by Qingyuan Traditional Chinese Medicine Hospital in May 2022, and CRFS were provided by Guangzhou First Affiliated Hospital of Traditional Chinese Medicine in June 2022. All samples were collected with the approvals from the respective authorities. The phenotypes of RFS are shown in Fig. 1.

Ethics statement
The collection of Radix Fici Simplicissimae material for this research complies with the IUCN Policy Statement on Research Involving Species at Risk of Extinction and the Convention on International Trade in Endangered Species of Wild Fauna and Flora. In addition, according to the List of National Key Protected Wild Plants issued by the State Forestry and Grassland Bureau of China, Radix Fici Simplicissimae, the experimental material of this study, is neither a national key protected wild plant nor an endangered plant species.

Sample preparation and extraction
The samples were ground and sieved (Chinese National Standard Sieve No. 3, R40/3 series) to obtain a homogeneous powder. Then, 0.1 g of dried sample was added to a 5 mL volumetric flask, and 5 mL of 50% methanol was added. The mixture was left standing for 60 min and then extracted by ultrasonication (350 W, 35 kHz; SK3300LH Ultrasonic Cleaner, Shanghai Kedao Ultrasonic Instrument Co., Ltd.) for 60 min at 37 °C. Methanol (50%) was added to compensate for the loss in weight. The mixture was centrifuged (13,000 rpm; Thermo Legend Micro17R Centrifuge) for 10 min to obtain a clear solution. Additionally, a reference solution of psoralen and apigenin was prepared using the same method.
To ensure the suitability and stability of the MS analysis, a quality control (QC) sample was prepared by pooling the same volume (10 µL) from every sample. Across the entire worklist, one QC sample was inserted after every five test samples, giving six QC injections to monitor the repeatability of the analysis. A volume of 3 µL was injected for each sample and QC. The repeatability of metabolite extraction and detection was assessed by overlaying the total ion chromatograms of the different QC injections.
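The QC scheduling described above can be sketched as follows. This is only an illustration: the sample labels are hypothetical, and the exact position of the sixth QC injection is not stated in the text, so the sketch assumes one leading QC followed by one QC after every five samples.

```python
def build_worklist(sample_ids, qc_every=5, lead_qc=True):
    """Interleave pooled-QC injections into a run order.

    Assumption (not stated in the paper): one QC is injected before the
    first sample, then one QC after every `qc_every` samples.
    """
    worklist = []
    qc_count = 0
    if lead_qc:
        qc_count += 1
        worklist.append(f"QC{qc_count}")
    for i, sample_id in enumerate(sample_ids, start=1):
        worklist.append(sample_id)
        if i % qc_every == 0:
            qc_count += 1
            worklist.append(f"QC{qc_count}")
    return worklist


# Hypothetical labels for the 17 WRFS and 12 CRFS batches.
samples = [f"W{i:02d}" for i in range(1, 18)] + [f"C{i:02d}" for i in range(1, 13)]
worklist = build_worklist(samples)
n_qc = sum(1 for entry in worklist if entry.startswith("QC"))
print(len(worklist), n_qc)
```

Under these assumptions, 29 samples yield six QC injections, which matches the count reported above; in practice the sample order would usually also be randomized.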

Mass spectrometry
The positive mode conditions were as follows: capillary voltage, 4.00 kV; carrier gas, nitrogen; sheath gas pressure, 3.5 MPa; auxiliary gas pressure, 1.0 MPa; capillary temperature, 320 °C; auxiliary gas heating temperature, 320 °C; primary resolution, 70,000. The negative mode conditions were identical to the positive mode conditions except for the capillary voltage (3.00 kV). The full scan mode was used, and positive and negative ions were detected simultaneously. The scanning range of the positive and negative ion spectra recorded by MS was m/z 80-1200.

Chemical component identification
For data collection, the samples were analyzed in primary (MS1) and secondary (MS2) scanning modes under both positive and negative ionization using UPLC-Q-Orbitrap HRMS (Thermo Fisher Scientific), and total ion chromatograms were plotted. From the fragmentation spectra acquired on the Orbitrap analyzer, the accurate relative molecular weight, retention time, and multistage fragment ion information of each compound were obtained using Compound Discoverer 3.2. The parameters were as follows: for 2D peak detection, 200 was set as the minimum peak area; for 3D peak detection, the peak intensities of low and high energy were set as > 1000 and > 200 counts, respectively; a mass error within ± 5 ppm was required for identified compounds; and a retention time difference within ± 0.1 min was allowed when matching the reference substance 21. The predicted fragments generated from the structures were matched and identified against the mzCloud database and ChemSpider. Supporting information was obtained from relevant literature in databases such as PubMed.
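The two matching tolerances above (± 5 ppm mass error and ± 0.1 min retention time) can be illustrated with a short sketch. The function names and the observed values are hypothetical; the reference m/z of 187.0390 is approximately the theoretical [M + H]+ of psoralen (C11H6O3), and this is not a description of how Compound Discoverer implements matching internally.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def matches_reference(obs_mz, obs_rt, ref_mz, ref_rt, ppm_tol=5.0, rt_tol=0.1):
    """True if a feature lies within both the ppm and RT (min) tolerances."""
    within_mass = abs(ppm_error(obs_mz, ref_mz)) <= ppm_tol
    within_rt = abs(obs_rt - ref_rt) <= rt_tol
    return within_mass and within_rt


# Illustrative values only: reference m/z ~ psoralen [M + H]+, hypothetical RT.
REF_MZ, REF_RT = 187.0390, 10.00

print(round(ppm_error(187.0395, REF_MZ), 2))               # small positive error
print(matches_reference(187.0395, 10.05, REF_MZ, REF_RT))  # within both windows
print(matches_reference(187.0405, 10.05, REF_MZ, REF_RT))  # mass error too large
print(matches_reference(187.0395, 10.25, REF_MZ, REF_RT))  # RT shift too large
```

A feature is accepted only when both conditions hold, so a close mass match at a shifted retention time is still rejected.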

Multivariate statistical analysis
The differences between WRFS and CRFS were explored using a metabolomics workflow. Multivariate statistical analysis was performed using SIMCA-P 14.0, and unsupervised principal component analysis (PCA) was used to obtain an initial understanding of the relationships within the data matrices. First, PCA was used for pattern recognition and to capture the maximum variation, providing an overview and classification of the samples. Second, the metabolite differences between the different varieties and cultivation methods of RFS were examined using supervised orthogonal projections to latent structures discriminant analysis (OPLS-DA). OPLS-DA in ESI+ and ESI− modes was performed to obtain the maximum separation between the CRFS and WRFS groups and to explore the potential biochemical markers contributing to the differences. S-plots were created to visualize the OPLS-DA predictive component loadings and facilitate model interpretation. The corresponding variable importance for projection (VIP) was calculated in the OPLS-DA model, and VIP values were used to screen the differential components. Metabolites with a VIP value of > 1 and a p-value of < 0.05 were considered potential markers. A heatmap of these markers, i.e., the metabolites with statistically significant differences between the classes, was generated in MetaboAnalyst 4.0 (www.metaboanalyst.ca) 22 to visualize the variation in differential metabolites between the groups. We annotated the obtained differential metabolites using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database and identified the corresponding pathways.
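The screening rule above (VIP > 1 and p < 0.05) amounts to a simple two-condition filter. The sketch below assumes that VIP values and p-values have already been exported from the modelling software; the feature names, numbers, and the `Feature` container are hypothetical and are not taken from the study's data.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    name: str
    vip: float         # variable importance for projection from OPLS-DA
    p_value: float     # from a two-group significance test
    mean_wrfs: float   # mean peak area in WRFS
    mean_crfs: float   # mean peak area in CRFS


def screen_markers(features, vip_cut=1.0, p_cut=0.05):
    """Keep features with VIP > vip_cut and p < p_cut; label direction in WRFS."""
    selected = []
    for f in features:
        if f.vip > vip_cut and f.p_value < p_cut:
            direction = "Up" if f.mean_wrfs > f.mean_crfs else "Down"
            selected.append((f.name, direction))
    return selected


# Hypothetical example values.
features = [
    Feature("feature_A", vip=2.3, p_value=0.001, mean_wrfs=8.0e6, mean_crfs=2.0e6),
    Feature("feature_B", vip=1.4, p_value=0.200, mean_wrfs=5.0e5, mean_crfs=4.0e5),
    Feature("feature_C", vip=0.7, p_value=0.010, mean_wrfs=1.0e5, mean_crfs=3.0e5),
    Feature("feature_D", vip=1.8, p_value=0.030, mean_wrfs=1.0e6, mean_crfs=4.0e6),
]
markers = screen_markers(features)
print(markers)
```

The "Up"/"Down" labels follow the convention used for the differential metabolites in this paper, where direction is expressed for WRFS relative to CRFS.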

Stability of the UPLC-MS/MS system
QC samples were used to evaluate the stability of the UPLC-MS/MS system. The total ion current curves of the QC injections overlapped closely. The relative standard deviations (RSD) of the areas of all peaks were calculated, and the proportions of features with RSD < 30% in the positive and negative modes were 98.54% and 98.33%, respectively. These results suggest a high stability of the UPLC-MS/MS system throughout the experiment. The similarity between the two BPI chromatograms was relatively high. Using Compound Discoverer 3.2 (Table 2), eighty-one compounds were characterized from CRFS and WRFS; they were detected as [M + H]+ and [M − H]− ions and were unambiguously or tentatively identified by matching accurate molecular weights within a mass accuracy of < 5 ppm. Both types of RFS extracts were rich in compounds with various structural patterns, including flavonoids, coumarins, alkaloids, glycosides, organic acids, and organic acid esters. In addition, the ion chromatograms and mass spectra of the psoralen and apigenin standards were compared, as shown in Fig. 3; the secondary fragment peaks were consistent with those of the corresponding compounds in Table 2, supporting the accuracy of compound identification by Compound Discoverer 3.2.
PCA is an important method for the dimensionality reduction of data and an unsupervised multivariate statistical pattern recognition method, and it may be used to highlight specific samples within the full dataset. The PCA score plots of WRFS and CRFS showed clear clustering and separation (Fig. 4A,B). To evaluate the differences in RFS between the cultivation methods and to understand the variables responsible for sample separation, we examined the OPLS-DA score plots, S-plots, permutation tests, and variable importance for projection values. OPLS-DA differs from PCA because it is a supervised discriminant analysis method with superior classification and prediction capabilities. OPLS-DA uses partial least squares regression to establish a relationship model between metabolite expression and sample categories in order to predict the sample categories. Therefore, the OPLS-DA method was used to determine the differences between the WRFS and CRFS components. The WRFS samples were separated from the CRFS samples in the OPLS-DA score plots (Fig. 4C,D), suggesting biochemical differences between WRFS and CRFS.
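The RSD criterion used for the QC samples can be sketched as follows. The peak areas are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not specify which form was used.

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation (%) using the sample standard deviation."""
    return stdev(values) / mean(values) * 100.0


def rsd_pass_rate(qc_peak_areas, threshold=30.0):
    """Return per-feature RSDs and the percentage of features below threshold."""
    rsds = {name: rsd_percent(areas) for name, areas in qc_peak_areas.items()}
    n_pass = sum(1 for value in rsds.values() if value < threshold)
    return rsds, n_pass / len(rsds) * 100.0


# Hypothetical peak areas for three features across six QC injections.
qc_peak_areas = {
    "feature_1": [100, 102, 98, 101, 99, 100],
    "feature_2": [50, 90, 20, 70, 30, 100],
    "feature_3": [200, 210, 190, 205, 195, 200],
}
rsds, pass_rate = rsd_pass_rate(qc_peak_areas)
print({name: round(value, 1) for name, value in rsds.items()}, round(pass_rate, 1))
```

Here two of the three hypothetical features fall below the 30% threshold; the same calculation over all detected features gives the percentages reported for each ionization mode.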
The data processed by Compound Discoverer 3.2 were imported into SIMCA-P 14.0, and unsupervised PCA was used to evaluate the classification trend and the differences between groups. The R2X of the model in positive and negative ion modes was greater than 0.4 (0.492 and 0.522, respectively), indicating that the model was stable and reliable. Two hundred rounds of random permutations were performed to verify the established OPLS-DA model (Fig. 4E,F).
A heatmap was generated based on the potential markers to evaluate them systematically and intuitively (Fig. 5) and to show the strength of the potential chemical markers in the two sample types. Combined with the identification results described above, the heatmap illustrates the close relationship among the 17 potential markers. The samples were divided into two categories, WRFS and CRFS, and the results were consistent with those of the PCA. In the heatmap, the color indicates the signal strength of each metabolite; the darker the red, the further the metabolite lies above the average level of the samples, whereas blue indicates that the metabolite is at a lower level.

Kyoto Encyclopedia of Genes and Genomes analysis of differential metabolites
The KEGG database integrates genome, chemistry, and system function information and is a comprehensive dataset of metabolic pathway information 23-25. The metabolic pathways are classified into different modules according to their functions, such as glycolysis, the TCA cycle, carbohydrate, nucleoside, and amino acid metabolism, organic compound and enzyme biodegradation, and other comprehensive metabolic pathways. Among the 17 differential metabolites, 14 were annotated to the KEGG database, 11 of which were annotated 29 times to KEGG pathways (Table 4). After removing duplicates, 15 KEGG pathways were identified. The KEGG pathway for phenylpropanoid biosynthesis is shown as an example (Fig. 6).
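The relationship between the total number of annotations and the number of distinct pathways after removing duplicates can be sketched as below. The metabolite names and their pathway assignments are hypothetical placeholders; only the KEGG map identifiers (for example, map00940 for phenylpropanoid biosynthesis) are real.

```python
from collections import Counter


def summarize_annotations(annotations):
    """Count metabolite-to-pathway annotations and distinct pathways.

    `annotations` maps a metabolite name to the KEGG pathway IDs it hits.
    Returns (total annotations, Counter of metabolites per pathway).
    """
    per_pathway = Counter(
        pathway_id
        for pathway_ids in annotations.values()
        for pathway_id in set(pathway_ids)  # count each metabolite once per pathway
    )
    total = sum(per_pathway.values())
    return total, per_pathway


# Hypothetical assignments, for illustration only.
annotations = {
    "metabolite_1": ["map00940", "map01100", "map01110"],
    "metabolite_2": ["map00941", "map00944", "map01100"],
    "metabolite_3": ["map00940", "map01110"],
}
total, per_pathway = summarize_annotations(annotations)
print(total, len(per_pathway), per_pathway.most_common(2))
```

In this toy input, three metabolites produce eight annotations that collapse to five distinct pathways, which mirrors how 29 annotations reduce to 15 pathways in the analysis above.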

Discussion
In this study, metabolomics based on UPLC-Q-Orbitrap HRMS combined with multivariate statistical analysis revealed substantial differences in the compound compositions of WRFS and CRFS. The identification and analysis of eighty-one compounds showed distinct chemical profiles between WRFS and CRFS samples from different cultivation methods. Moreover, the compounds identified in this study are consistent with those reported by Cheng Jun et al. 26,27, confirming that RFS mainly contains phenylpropanoids, flavonoids, coumarins, and other substances. In addition, the chemical composition of RFS identified by Lao et al. 28 and Zhao et al. 29, including vitexin, vanillin, luteolin, psoralen, apigenin, bergapten, ursolic acid, and other compounds (the 17 potential differential metabolites between CRFS and WRFS), was consistent with our identification results. Using multivariate statistical analysis and a heatmap, WRFS and CRFS showed remarkable discrimination. Many markers exhibited different expression levels between the two sample types. Psoralen, bergapten, and apigenin were upregulated in WRFS, and the contents of these three active substances were much higher in WRFS than in CRFS. Many researchers have found that psoralen, bergapten, and apigenin can be used as quality markers of RFS and that they are the active ingredients with the highest contents 30,31.
Radix Fici Simplicissimae, one of the ten famous medicines of the Lingnan region, has been shown to protect the liver, relieve inflammation, and exert antioxidant and anti-cancer activities 32. The ethanol extract of RFS can protect the liver of mice from alcohol-induced injury, probably by inducing and regulating downstream antioxidant factors and by suppressing the abnormal activation of CYP2E1 protein, thereby reducing oxidative stress and ultimately reducing the damage to the liver caused by alcohol 33. Zhou Tiannong et al. 34 found that, compared with the control group, the water extract of RFS can significantly inhibit the increase in abdominal capillary diameter and improve the pain threshold of mice. It can also reduce the levels of alanine aminotransferase (ALT) and aspartate aminotransferase (AST) in mouse serum, indicating good anti-inflammatory, analgesic, and liver-protective effects. In-depth research on the active components of RFS shows that it mainly consists of phenolic acids, terpenoids, flavonoids, and coumarins 26,35. Many scholars 36-39 believe that the active components with pharmacological effects that can be used as quality markers are psoralen, biflavonoids, and apigenin. Therefore, these should be studied as the main indices. Psoralen, biflavonoids, and apigenin have antitumor 40,41, neuroprotective 42, anti-inflammatory 43, antioxidant, and other pharmacological activities, which can be used to treat cancer, insomnia, Alzheimer's disease, rheumatoid arthritis, and aging. Psoralen and biflavonoids can prevent osteoporosis 44, while apigenin can enhance immunity and prevent hypertension, arteriosclerosis, and cardiovascular and cerebrovascular diseases 45. Because these differential metabolites play an important role in health-related effects, these three components are very important for the quality evaluation of RFS. Our results show that the quality of WRFS is better than that of CRFS.
The main metabolic pathways that differ between WRFS and CRFS include primary and secondary metabolite biosynthesis. Psoralen, apigenin, and biflavonoids are annotated in multiple KEGG pathways, including phenylpropanoid biosynthesis, flavonoid biosynthesis, and flavone and flavonol biosynthesis. Phenylpropanoid biosynthesis is an important metabolic process, mainly involving the metabolism of amino acids such as phenylalanine and tyrosine. The process involves various enzymes that catalyze reactions converting phenylalanine into other amino acids such as tyrosine. This biochemical process is crucial for the normal functioning of many physiological functions in the human body 46. Flavonoids are an important branch of the phenylpropanoid metabolic pathway. The biosynthesis of flavonoids begins with phenylalanine, which is converted by enzymes such as chalcone synthase to produce chalcone. Subsequently, the chalcone isomerizes into flavonoids, which then give rise to a variety of other flavonoid compounds, such as flavonols, isoflavones, and anthocyanins 47. In addition, flavones and flavonols are two important subclasses of flavonoids. Therefore, the results of this study provide clues for analyzing these metabolites and their metabolic networks in RFS. A limitation of this study is that the variety and quantity of RFS samples collected were limited; the sample size should be expanded in further work to better support the results.
This study showed that WRFS was superior to CRFS in quality, explained the effects of different growth environments and drug varieties on secondary metabolites, and provides insights for further targeted breeding of improved CRFS varieties.

Conclusion
In this study, a UPLC-Q-Orbitrap HRMS method was established and successfully applied to determine the component profiles of various RFS samples grown under different cultivation methods. Using multivariate statistical analysis and heat maps, WRFS and CRFS were shown to have significant differences. Psoralen, bergapten, and apigenin were significantly upregulated in WRFS compared to CRFS. Due to the important roles of these differential metabolites, our results indicate that the quality of WRFS is superior to that of CRFS, and this strategy will benefit the process of quality evaluation of RFS formulations.

Figure 4. Principal component analysis (PCA) of WRFS and CRFS in ESI+ (A) and ESI− (B) mode. OPLS-DA score plots of WRFS and CRFS in ESI+ (C) and ESI− (D) mode. Cross-validation plots of the OPLS-DA model with 200 permutation tests in ESI+ (E) and ESI− (F) mode. OPLS-DA S-plots in ESI+ (G) and ESI− (H) mode. (The points marked in red in S-plots G and H are potential chemical markers.)

Table 2. UPLC-Q-Orbitrap HRMS-based determination of the chemical composition of CRFS and WRFS cultivated in different ways.

Table 3. Seventeen differential metabolites. Up: compared with CRFS, the corresponding metabolite was upregulated in WRFS. Down: compared with CRFS, the corresponding metabolite was downregulated in WRFS.

Table 4. Categories of 14 differential metabolite-annotated KEGG pathways. ID Annotation, ID of KEGG pathway; Number, the number of metabolites that can be annotated to the corresponding KEGG pathways; Matching IDs, number of compounds in the KEGG pathway.

Figure 6. Phenylpropanoid biosynthesis. The compounds marked in red are differential metabolites belonging to the phenylpropanoids.